In-class Exercise 3: Advanced Spatial Point Patterns Analysis

Author

Christover Manafe

Published

September 9, 2024

Modified

September 9, 2024

1 Overview

Network constrained Spatial Point Pattern Analysis (NetSPAA) is a collection of specialized methods designed for analyzing spatial point events that occur on or alongside a network. These events could include, for example, the locations of traffic accidents or childcare centers, while the network itself might represent road systems, river networks, or other linear infrastructures.

In this in-class exercise, we will learn practical experience with the key functions of the spNetwork package, specifically:

  • Deriving Network Kernel Density Estimation (NKDE)
  • Performing Network-based G-function and K-function analysis

2 The data

In this exercise, we will analyse the spatial distribution of childcare center in Punggol planning area using following geospatial datasets:

Dataset Description Format
Punggol_St Line features geospatial data which store the road network within Punggol Planning Area. ESRI shapefile
Punggol_CC Point features geospatial data which store the location of childcare centres within Punggol Planning Area. ESRI shapefile

3 Installing and launching the R packages

We will use following packages in this exercise:

Package Description
sf Provides functions to manage, processing, and manipulate Simple Features, a formal geospatial data standard that specifies a storage and access model of spatial geometries such as points, lines, and polygons.
spNetwork Provides functions to perform Spatial Point Patterns Analysis such as kernel density estimation (KDE) and K-function on network. It also can be used to build spatial matrices (‘listw’ objects like in ‘spdep’ package) to conduct any kind of traditional spatial analysis with spatial weights based on reticular distances.
tidyverse Provides collection of functions for performing data science task such as importing, tidying, wrangling data and visualising data.
tmap Provides functions for plotting cartographic quality static point patterns maps or interactive maps by using leaflet API

To install and launch the four R packages.

pacman::p_load(sf, spNetwork, tmap, tidyverse)

4 Data Import and Preparation

4.1 Import

The code chunk below uses st_read() of sf package to important Punggol_St and Punggol_CC geospatial data sets into RStudio as sf data frames.

network <- st_read(dsn="data/geospatial", 
                   layer="Punggol_St")
Reading layer `Punggol_St' from data source 
  `/Users/cham/project/Geospatial-Analytics/chrismanafe/ISSS626-GAA/in_class_ex/in_class_ex03/data/geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 2642 features and 2 fields
Geometry type: LINESTRING
Dimension:     XY
Bounding box:  xmin: 34038.56 ymin: 40941.11 xmax: 38882.85 ymax: 44801.27
Projected CRS: SVY21 / Singapore TM
childcare <- st_read(dsn="data/geospatial",
                     layer="Punggol_CC") %>%
  st_zm(drop = T,
        what = "ZM")
Reading layer `Punggol_CC' from data source 
  `/Users/cham/project/Geospatial-Analytics/chrismanafe/ISSS626-GAA/in_class_ex/in_class_ex03/data/geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 61 features and 1 field
Geometry type: POINT
Dimension:     XYZ
Bounding box:  xmin: 34423.98 ymin: 41503.6 xmax: 37619.47 ymax: 44685.77
z_range:       zmin: 0 zmax: 0
Projected CRS: SVY21 / Singapore TM

We noticed that geometry attribute of childcare has 3D point, but we require 2D points for NKDE function. So we need to drop the Z-dimension of geometry attribute.

4.2 Examine data structure

childcare
Simple feature collection with 61 features and 1 field
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 34423.98 ymin: 41503.6 xmax: 37619.47 ymax: 44685.77
Projected CRS: SVY21 / Singapore TM
First 10 features:
      Name                  geometry
1   kml_10 POINT (36173.81 42550.33)
2   kml_99 POINT (36479.56 42405.21)
3  kml_100 POINT (36618.72 41989.13)
4  kml_101 POINT (36285.37 42261.42)
5  kml_122  POINT (35414.54 42625.1)
6  kml_161 POINT (36545.16 42580.09)
7  kml_172 POINT (35289.44 44083.57)
8  kml_188 POINT (36520.56 42844.74)
9  kml_205  POINT (36924.01 41503.6)
10 kml_222 POINT (37141.76 42326.36)
st_crs(childcare)
Coordinate Reference System:
  User input: SVY21 / Singapore TM 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]
network
Simple feature collection with 2642 features and 2 fields
Geometry type: LINESTRING
Dimension:     XY
Bounding box:  xmin: 34038.56 ymin: 40941.11 xmax: 38882.85 ymax: 44801.27
Projected CRS: SVY21 / Singapore TM
First 10 features:
     LINK_ID                   ST_NAME                       geometry
1  116130894                PUNGGOL RD LINESTRING (36546.89 44574....
2  116130897 PONGGOL TWENTY-FOURTH AVE LINESTRING (36546.89 44574....
3  116130901   PONGGOL SEVENTEENTH AVE LINESTRING (36012.73 44154....
4  116130902   PONGGOL SEVENTEENTH AVE LINESTRING (36062.81 44197....
5  116130907           PUNGGOL CENTRAL LINESTRING (36131.85 42755....
6  116130908                PUNGGOL RD LINESTRING (36112.93 42752....
7  116130909           PUNGGOL CENTRAL LINESTRING (36127.4 42744.5...
8  116130910               PUNGGOL FLD LINESTRING (35994.98 42428....
9  116130911               PUNGGOL FLD LINESTRING (35984.97 42407....
10 116130912            EDGEFIELD PLNS LINESTRING (36200.87 42219....
st_crs(network)
Coordinate Reference System:
  User input: SVY21 / Singapore TM 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]

5 Visualising the Geospatial Data

We also use st_geometry() to focus purely on the spatial geometries of the object (points, lines, polygons).

plot(st_geometry(network))
plot(childcare,add=T,col='red',pch = 18)

tmap_mode('view')
tmap mode set to interactive viewing
tm_shape(childcare) + 
  tm_dots(col = 'red') + 
  tm_shape(network) +
  tm_lines()
tmap_mode('plot')
tmap mode set to plotting

6 Visualising the Geospatial Data

plot(st_geometry(network))
plot(childcare,add=T,col='red',pch = 19)

tmap_mode('view')
tmap mode set to interactive viewing
tm_shape(childcare) + 
  tm_dots(col = 'red') + 
  tm_shape(network) +
  tm_lines()
tmap mode set to plotting

7 Network Kernel Density Estimate (NKDE) Analysis

7.1 Preparing the lixels object

Before computing NKDE, the SpatialLines object need to be split into lixels with a specified minimal distance. This operation can be accomplished using lixelize_lines() function from the spNetwork package.

lixels <- lixelize_lines(network, 
                         700, 
                         mindist = 350)
Note

In above code, we set the length of a lixel to 700m and set the minimum length of a lixel to 350m.

Additional Notes:

  • If the final lixel after splitting is shorter than the specified minimal distance, it is combined with preceding lixel.
  • If the minimum distance is not specified (NULL), it defaults to \(\frac{\text{maxdist}}{10}\).
  • If the segments that are already shorter than the minimum distance are not modified.
  • Another function, lixelize_lines.mc(), is available, which provides multicore support for processing.

7.2 Generate line centre points

lines_center() of spNetwork is used to generate a SpatialPointsDataFrame with line centre points. The points are located at center of the line based on the length of the line.

samples <- lines_center(lixels) 

7.3 Visualize line center points

tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(lixels) +
  tm_lines() +
tm_shape(samples) +
tm_dots(size = 0.01)
tmap_mode("plot")
tmap mode set to plotting

7.4 Performing NKDE

To compute NKDE:

densities <- nkde(network, 
                  events = childcare,
                  w = rep(1, nrow(childcare)),
                  samples = samples,
                  kernel_name = "quartic",
                  bw = 300, 
                  div= "bw", 
                  method = "simple", 
                  digits = 1, 
                  tol = 1,
                  grid_shape = c(1,1), 
                  max_depth = 8,
                  agg = 5, 
                  sparse = TRUE,
                  verbose = FALSE)
Note

It is recommended to read the documentation of nkde() function to understand various parameters too calibrate NKDE model.

7.5 Visualising NKDE

7.5.1 Insert computed density values into samples and lixels object as density field

samples$density <- densities
lixels$density <- densities

7.6 Rescale density values

Since svy21 projection system is in meter, the computed density values are very small i.e. 0.0000005. We can rescale the density values from number of events per meter to number of events per kilometer using following code:

# rescaling to help the mapping
samples$density <- samples$density*1000
lixels$density <- lixels$density*1000

7.7 Visualise using tmap package

We use tmap package to prepare interactive and high cartographic quality map visualisation.

tmap_mode('view')
tmap mode set to interactive viewing
tm_shape(lixels)+
  tm_lines(col="density")+
tm_shape(childcare)+
  tm_dots()
tmap_mode('plot')
tmap mode set to plotting

Road segments with relatively higher density of childcare centres is highlighted with darker color. Road segments with relatively lower density of childcare centres is highlighted with lighter color.

8 Network Constrained G- and K-Function Analysis

We will perform a Complete Spatial Randomness (CSR) test using the kfunctions() function from the spNetwork package.

Null Hypothesis

H₀: The observed spatial point events (i.e., the distribution of childcare centers) are uniformly distributed over the street network in the Punggol Planning Area.

CSR Test Assumptions

  • The CSR test is based on the binomial point process assumption.
  • This implies that childcare centers are randomly and independently distributed across the street network.

Interpretation of Results

  • If the null hypothesis is rejected, we may infer that the distribution of childcare centers is not random.
  • This suggests that the centers are spatially interacting and dependent on one another, potentially forming non-random patterns.
kfun_childcare <- kfunctions(network, 
                             childcare,
                             start = 0, 
                             end = 1000, 
                             step = 50, 
                             width = 50, 
                             nsim = 49, 
                             resolution = 50,
                             verbose = FALSE, 
                             conf_int = 0.05)
Arguments used
Argument Description
lines A SpatialLinesDataFrame containing the sampling points. The geometries must be valid; invalid geometries may cause crashes.
points A SpatialPointsDataFrame representing the points on the network, which will be snapped onto the network.
start A double indicating the starting value for evaluating the k and g functions.
end A double specifying the final value for evaluating the k and g functions.
step A double representing the step size or interval between two evaluations of the k and g functions.
width The width of each donut ring for the g-function.
nsim An integer specifying the number of Monte Carlo simulations to run. For example, 50 simulations were performed in the given code chunk. Usually, more simulations are required for robust inference.
resolution The resolution for simulating random points on the network. A lower resolution reduces calculation time. If NULL, random points are placed anywhere on the network. Specifying a value splits edges, and random points are selected from vertices.
conf_int A double representing the width of the confidence interval, with a default value of 0.05.

The output of kfunctions() is a list with the following values:

  • plotk, a ggplot2 object representing the values of the k-function
  • plotg, a ggplot2 object representing the values of the g-function
  • values, a DataFrame with the values used to build the plots

Visualise ggplot2 object of k-function

kfun_childcare$plotk

9 References

To-be-updated